The widespread use of social media and digital technologies facilitates the dissemination of news and information about events and activities. Although positive information is shared, social media also spreads misinformation and disinformation. Efforts have been made to identify such misleading information, both manually by human experts and with automated tools. Because a large volume of content containing factual claims appears online, manual efforts do not scale well. Automatically identifying check-worthy claims is therefore very useful for human experts. In this study, we describe our participation in Subtask-1A: check-worthiness of tweets (English, Dutch, and Spanish) of the CheckThat! Lab at CLEF 2022. We performed standard preprocessing steps and applied different models to determine whether a given text is worth fact-checking. We used oversampling techniques to balance the dataset and applied SVM and Random Forest (RF) with TF-IDF representations. We also used the BERT multilingual (BERT-m) and XLM-RoBERTa-base pretrained models in our experiments. We used BERT-m for the official submission, and our system ranked third, fifth, and twelfth for Spanish, Dutch, and English, respectively. In further experiments, our evaluation shows that the transformer models (BERT-m and XLM-RoBERTa-base) outperform SVM and RF for Dutch and English, while the opposite was observed for Spanish.
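The classical side of the pipeline described above (TF-IDF features, oversampling to balance the classes, then an SVM) can be sketched with scikit-learn. The toy tweets and labels below are illustrative only, not drawn from the CLEF data:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.utils import resample

# Toy tweets: 1 = check-worthy factual claim, 0 = not check-worthy
texts = [
    "The new vaccine was 95% effective in the phase-3 trial",
    "Unemployment fell to 3.5 percent last quarter",
    "The city reported 1,200 new cases yesterday",
    "What a beautiful sunny morning!",
    "Good luck to everyone taking exams today",
]
labels = np.array([1, 1, 1, 0, 0])

# Oversample the minority class (label 0) until the classes are balanced
minority_idx = np.where(labels == 0)[0]
extra_idx = resample(minority_idx, replace=True,
                     n_samples=(labels == 1).sum() - len(minority_idx),
                     random_state=42)
balanced_idx = np.concatenate([np.where(labels == 1)[0], minority_idx, extra_idx])

# TF-IDF representation of the balanced set, then a linear SVM
X = TfidfVectorizer().fit_transform([texts[i] for i in balanced_idx])
y = labels[balanced_idx]
clf = LinearSVC(random_state=42).fit(X, y)
train_acc = clf.score(X, y)
```

Swapping `LinearSVC` for `RandomForestClassifier` gives the RF variant of the same pipeline.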
Recent research in disaster informatics has demonstrated practical and important use cases of artificial intelligence for saving human lives and reducing suffering during natural disasters, based on social media content (text and images). While notable progress has been made using text, research exploiting images remains relatively under-explored. To advance image-based approaches, we propose MEDIC (available at: https://crisisnlp.qcri.org/medic/index.html), the largest social media image classification dataset for humanitarian response, consisting of 71,198 images across four different tasks in a multitask learning setup. It is the first dataset of its kind, combining social media images, disaster response, and multitask learning research. An important property of this dataset is its high potential to contribute to multitask learning, which has recently received much interest from the machine learning community and has shown remarkable results in terms of memory, inference speed, performance, and generalization capability. The proposed dataset is therefore an important resource for advancing image-based disaster management and multitask machine learning research.
Pneumonia, a respiratory infection caused by bacteria or viruses, affects a large number of people, especially in developing and impoverished countries where high levels of pollution, unsanitary living conditions, and overcrowding are common, along with insufficient medical infrastructure. Pneumonia can lead to pleural effusion, a condition in which fluid fills the lung and complicates breathing. Early detection of pneumonia is essential for ensuring curative care and boosting survival rates. Chest X-ray imaging is the most commonly used approach for diagnosing pneumonia. The purpose of this work is to develop a method for the automatic diagnosis of bacterial and viral pneumonia in digital X-ray images. This article first presents the authors' technique and then gives a comprehensive report on recent developments in reliable pneumonia diagnosis. In this study, we fine-tuned state-of-the-art deep convolutional neural networks to classify chest X-ray images and tested their performance. The deep learning architectures are compared empirically: VGG19, ResNet152V2, ResNeXt101, SEResNet152, MobileNetV2, and DenseNet201. The experimental data consist of two groups, sick and healthy chest X-ray images. Because taking appropriate action against the disease as soon as possible is critical, rapid identification models are preferred. DenseNet201 showed no overfitting or performance degradation in our experiments, and its accuracy tends to increase with the number of epochs. Further, DenseNet201 achieves state-of-the-art performance with a significantly smaller number of parameters and within a reasonable computing time. This architecture outperforms the competition in terms of test accuracy, scoring 95%. Each architecture was trained using Keras, with Theano as the backend.
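A minimal sketch of the DenseNet201 classifier described above, using the modern `tf.keras` API rather than the Theano backend the study used. `weights=None` keeps the example offline and self-contained; the actual study would start from pretrained weights and fine-tune on chest X-ray images, and the input size here is arbitrary:

```python
import numpy as np
import tensorflow as tf

# DenseNet201 backbone plus a small two-class head (normal vs pneumonia).
base = tf.keras.applications.DenseNet201(
    include_top=False, weights=None, input_shape=(96, 96, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # normal vs pneumonia
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# One dummy batch just to show the forward pass; real training would call
# model.fit on labelled chest X-ray images.
probs = model.predict(np.random.rand(1, 96, 96, 3), verbose=0)
```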
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies DR grading, localizes lesion areas, and provides visual explanations; and (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations using Wasserstein distance and adversarial-learning-based entropy minimization. Besides, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss-function constraint between lesion features and classification features, our approach is robust to a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRiD and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
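The idea of attention-based lesion selection feeding the grading head can be illustrated with a toy numpy sketch. The shapes, the random features, and the query vector are all illustrative stand-ins, not the DRG-Net parameterization:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Each candidate lesion region yields a feature vector; a learned query
# scores the regions, and the grading head consumes the attention-weighted
# sum. The argmax gives the region highlighted in the visual explanation.
rng = np.random.default_rng(1)
lesion_feats = rng.standard_normal((5, 16))  # 5 candidate regions, 16-d features
query = rng.standard_normal(16)              # stand-in for a learned query

scores = lesion_feats @ query
attn = softmax(scores)                       # importance of each region
grading_feat = attn @ lesion_feats           # weighted feature for DR grading
most_significant = int(attn.argmax())        # region flagged for explanation
```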
The ability to distinguish between different movie scenes is critical for understanding the storyline of a movie. However, accurately detecting movie scenes is often challenging, as it requires reasoning over very long movie segments. This is in contrast to most existing video recognition models, which are typically designed for short-range video analysis. This work proposes a State-Space Transformer model that can efficiently capture dependencies in long movie videos for accurate movie scene detection. Our model, dubbed TranS4mer, is built using a novel S4A building block, which combines the strengths of structured state-space sequence (S4) and self-attention (A) layers. Given a sequence of frames divided into movie shots (uninterrupted periods where the camera position does not change), the S4A block first applies self-attention to capture short-range intra-shot dependencies. Afterward, the state-space operation in the S4A block is used to aggregate long-range inter-shot cues. The final TranS4mer model, which can be trained end-to-end, is obtained by stacking the S4A blocks one after the other multiple times. Our proposed TranS4mer outperforms all prior methods on three movie scene detection datasets, including MovieNet, BBC, and OVSD, while also being $2\times$ faster and requiring $3\times$ less GPU memory than standard Transformer models. We will release our code and models.
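The two-stage structure of the S4A block (self-attention within each shot, then a state-space scan across shots) can be sketched in numpy. This is a heavily simplified toy: single-head unprojected attention and a scalar-decay diagonal recurrence standing in for the actual S4 parameterization:

```python
import numpy as np

def self_attention(x):
    """Single-head dot-product attention over one shot's frames, shape (T, d)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def ssm_scan(x, decay=0.9):
    """Toy diagonal state-space recurrence aggregating long-range cues:
    h_t = decay * h_{t-1} + x_t (a stand-in for the S4 layer)."""
    h = np.zeros(x.shape[-1])
    out = []
    for x_t in x:
        h = decay * h + x_t
        out.append(h.copy())
    return np.stack(out)

def s4a_block(shots):
    """shots: list of (frames, d) arrays, one per movie shot."""
    # 1) short-range intra-shot dependencies via self-attention
    shot_feats = np.stack([self_attention(s).mean(axis=0) for s in shots])
    # 2) long-range inter-shot cues via the state-space scan
    return ssm_scan(shot_feats)

rng = np.random.default_rng(0)
shots = [rng.standard_normal((4, 8)) for _ in range(6)]  # 6 shots, 4 frames each
out = s4a_block(shots)
```

Stacking several such blocks (with projections, normalization, and residuals, omitted here) would mirror the end-to-end TranS4mer stack.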
An expansion of aberrant brain cells is referred to as a brain tumor. The brain's architecture is extremely intricate, with several regions controlling various nervous system processes. A brain tumor can develop in any part of the brain or skull, including the brain's protective lining, the base of the skull, the brainstem, the sinuses, the nasal cavity, and many other places. Over the past ten years, numerous advances have been made in the field of computer-aided brain tumor diagnosis. Recently, instance segmentation has attracted a lot of interest in numerous computer vision applications. It seeks to assign distinct IDs to different scene objects, even if they belong to the same class. Typically, a two-stage pipeline is used to perform instance segmentation. This study demonstrates brain tumor segmentation using YOLOv5, which takes a dataset of images with corresponding annotation text files. You Only Look Once (YOLO) is a popular and widely used algorithm, well known for its object detection capabilities; YOLOv2, v3, v4, and v5 are among the versions published in recent years. Early brain tumor detection is one of the most important tasks for neurologists and radiologists. However, manually identifying and segmenting brain tumors from Magnetic Resonance Imaging (MRI) data can be difficult and error-prone, so an automated brain tumor detection system is necessary for early diagnosis. Our model covers three classes: meningioma, pituitary, and glioma. The results show that our model achieves competitive accuracy with reasonable runtime on an Apple M2 10-core GPU.
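The "images plus corresponding text files" format mentioned above is the standard YOLO annotation layout: one `.txt` file per image, one line per object, with the class id followed by a normalized box. A small sketch of the conversion from pixel coordinates (the class order and box values here are hypothetical):

```python
# YOLOv5 expects, per image, a .txt file with one line per object:
#   <class_id> <x_center> <y_center> <width> <height>
# with all four box values normalized to [0, 1].
classes = ["meningioma", "pituitary", "glioma"]  # hypothetical class order

def yolo_label(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space bounding box (x1, y1, x2, y2) to a YOLO label line."""
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A made-up glioma box on a 512x512 MRI slice
line = yolo_label(2, 100, 120, 220, 260, 512, 512)
```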
Network intrusion detection systems (NIDSs) play an important role in computer network security. There are several detection mechanisms, among which anomaly-based automated detection significantly outperforms the others. Amid the sophistication and growing number of attacks, dealing with large amounts of data is a recognized issue in the development of anomaly-based NIDSs. However, do current models meet the needs of today's networks in terms of required accuracy and dependability? In this research, we propose a new hybrid model that combines machine learning and deep learning to increase detection rates while ensuring dependability. Our proposed method ensures efficient pre-processing by combining SMOTE for data balancing and XGBoost for feature selection. We compared our developed method to various machine learning and deep learning algorithms to find a more efficient algorithm to implement in the pipeline. Furthermore, we chose the most effective model for network intrusion detection based on a set of benchmarked performance analysis criteria. Our method produces excellent results when tested on two datasets, KDDCUP'99 and CIC-MalMem-2022, with accuracies of 99.99% and 100%, respectively, and no overfitting or Type-1/Type-2 errors.
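The pre-processing stage (SMOTE-style balancing, then importance-based feature selection) can be sketched on toy data. The oversampler below is a minimal re-implementation of the SMOTE idea in numpy, and scikit-learn's gradient boosting stands in for XGBoost; both substitutions are assumptions for the sake of a self-contained example:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Imbalanced toy flows: 40 benign, 8 attack, 6 features each
X_maj = rng.normal(0.0, 1.0, (40, 6))
X_min = rng.normal(3.0, 1.0, (8, 6))
X = np.vstack([X_maj, X_min])
y = np.array([0] * 40 + [1] * 8)

def smote_like(X_min, n_new, rng):
    """Minimal SMOTE-style oversampling: each synthetic sample interpolates
    between a random minority point and its nearest minority neighbour."""
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf
        j = d.argmin()
        lam = rng.random()
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(new)

# Balance the classes, then rank features by boosting importances
X_syn = smote_like(X_min, 40 - 8, rng)
X_bal = np.vstack([X, X_syn])
y_bal = np.concatenate([y, np.ones(len(X_syn), dtype=int)])

gb = GradientBoostingClassifier(random_state=0).fit(X_bal, y_bal)
top_features = np.argsort(gb.feature_importances_)[::-1][:3]
```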
Purpose: Parallel imaging accelerates magnetic resonance imaging (MRI) data acquisition by acquiring additional sensitivity information with an array of receiver coils, reducing the number of phase-encoding steps. Compressed sensing MRI (CS-MRI) has gained popularity in the medical imaging field because of its lower data requirements compared to parallel imaging. Both parallel imaging and compressed sensing (CS) speed up traditional MRI acquisition by minimizing the amount of data captured in k-space. Since acquisition time is inversely proportional to the number of samples, reconstructing images from reduced k-space samples leads to faster acquisition but introduces aliasing artifacts. This paper proposes a novel generative adversarial network (GAN), namely RECGAN-GR, supervised with multi-modal losses, to de-alias the reconstructed images. Methods: In contrast to existing GAN networks, our proposed method introduces a novel generator network integrated with dual-domain loss functions, including weighted magnitude and phase loss functions as well as a parallel-imaging-based loss, namely GRAPPA consistency loss. A k-space correction block is proposed to make the GAN network self-resistant to generating unnecessary data, which speeds up the convergence of the reconstruction process. Results: Comprehensive results demonstrate that the proposed RECGAN-GR achieves a 4 dB improvement in PSNR over GAN-based methods and a 2 dB improvement over conventional state-of-the-art CNN methods available in the literature. Conclusion and significance: The proposed work contributes to significant improvement in image quality for low amounts of retained data, enabling 5x or 10x faster acquisition.
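The aliasing that motivates this work, where image reconstruction from reduced k-space samples folds the object onto itself, can be demonstrated in one dimension with numpy. This illustrates the sampling physics only, not the RECGAN-GR model:

```python
import numpy as np

# Toy 1D "MRI" example: acquisition time scales with the number of
# phase-encoding lines, so keeping every other k-space sample halves
# scan time but creates a shifted replica of the object (aliasing).
n = 128
x = np.zeros(n)
x[40:56] = 1.0                      # a simple object profile

k_full = np.fft.fft(x)              # fully sampled k-space
k_under = np.zeros_like(k_full)
k_under[::2] = k_full[::2]          # retain every other sample (2x acceleration)

recon = np.fft.ifft(k_under).real   # zero-filled reconstruction

# 2x comb sampling gives recon[t] = (x[t] + x[(t + n/2) % n]) / 2,
# so a half-intensity ghost of the object appears n/2 pixels away.
alias_peak = np.abs(recon[(40 + n // 2):(56 + n // 2)]).mean()
```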
Thanks to recent advances in computer vision, traffic video data has become a key factor in mitigating traffic congestion. This work presents a unique technique that uses a color-coding scheme to prepare traffic data before training a deep convolutional neural network. First, the video data are converted into an image dataset. Then, vehicle detection is performed using the You Only Look Once (YOLO) algorithm. A color-coding scheme is employed to transform the image dataset into a binary image dataset. These binary images are fed into a deep convolutional neural network. Using the UCSD dataset, we obtained a classification accuracy of 98.2%.
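The binarization step can be sketched as painting detected-vehicle boxes onto a black canvas, so each frame becomes a binary occupancy image for the CNN. The box coordinates below are made up; in the pipeline they would come from the YOLO detector:

```python
import numpy as np

def boxes_to_binary(detections, frame_shape):
    """Render (x1, y1, x2, y2) vehicle boxes as a binary occupancy image."""
    canvas = np.zeros(frame_shape, dtype=np.uint8)
    for (x1, y1, x2, y2) in detections:
        canvas[y1:y2, x1:x2] = 1
    return canvas

frame_shape = (120, 160)
detections = [(10, 20, 40, 50), (60, 30, 100, 70)]  # hypothetical YOLO boxes
binary = boxes_to_binary(detections, frame_shape)
occupancy = binary.mean()  # fraction of the frame covered by vehicles
```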
Body-worn first-person vision (FPV) cameras enable the extraction of a rich source of information about the environment from the subject's perspective. However, compared with other activity settings (e.g., kitchens and outdoor ambulatory environments), research on wearable-camera-based egocentric office activities has progressed slowly, mainly due to the lack of adequate datasets for training more sophisticated (e.g., deep learning) models for human activity recognition in office environments. This paper provides BON, a large public office activity dataset collected in different office settings across three geographical locations: Barcelona (Spain), Oxford (UK), and Nairobi (Kenya), using a chest-mounted GoPro Hero camera. The BON dataset contains eighteen common office activities that can be categorized into person-to-person interactions (e.g., chatting with colleagues), person-to-object interactions (e.g., writing on a whiteboard), and proprioceptive activities (e.g., walking). Annotations are provided for 5-second video segments. In total, BON contains 25 subjects and 2,639 segments. To facilitate further research in this sub-domain, we also provide results that can be used as baselines for future studies.